Multi-modal summarization model based on semantic relevance analysis

Yuxiang LIN, Yunbing WU, Aiying YIN, Xiangwen LIAO

Journal of Computer Applications 2024, 44 (1): 65-72. DOI: 10.11772/j.issn.1001-9081.2022101527

Multi-modal abstractive summarization is commonly based on the Sequence-to-Sequence (Seq2Seq) framework, whose objective function optimizes the model at the character level. Such training searches for locally optimal results when generating words and ignores the global semantic information of the summary samples. This can cause semantic deviation between the summary and the multimodal information, resulting in factual errors. To solve these problems, a multi-modal summarization model based on semantic relevance analysis was proposed. Firstly, a summary generator based on the Seq2Seq framework was trained to generate semantically diverse candidate summaries. Secondly, a summary evaluator based on semantic relevance analysis was applied to learn, from a global perspective, both the semantic differences among candidate summaries and the evaluation mode of ROUGE (Recall-Oriented Understudy for Gisting Evaluation), so that the model could be optimized at the level of summary samples. Finally, the summary evaluator was used to carry out reference-free evaluation of the candidate summaries, so that the finally selected summary sample was as similar as possible to the source text in semantic space. Experiments on the benchmark dataset MMSS show that, compared with the current best-performing model MPMSE (Multimodal Pointer-generator via Multimodal Selective Encoding), the proposed model improves ROUGE-1, ROUGE-2 and ROUGE-L by 3.17, 1.21 and 2.24 percentage points respectively.
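The final selection step, choosing the candidate summary closest to the source in semantic space without a reference summary, can be illustrated with a minimal sketch. This is not the paper's evaluator: the abstract describes a learned evaluator over multimodal input, whereas the sketch below assumes a toy bag-of-words encoder over text only and uses cosine similarity as the relevance score. The function names `bag_of_words`, `cosine` and `select_summary` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Toy stand-in for a learned semantic encoder (assumption):
    maps text to lowercase token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_summary(source, candidates):
    """Reference-free selection: score each candidate against the
    source text only and return the highest-scoring candidate."""
    src_vec = bag_of_words(source)
    scored = [(cosine(src_vec, bag_of_words(c)), c) for c in candidates]
    return max(scored, key=lambda pair: pair[0])[1]


source = "storm floods coastal town and closes schools"
candidates = [
    "storm floods coastal town",
    "team wins football match",
    "schools open today",
]
best = select_summary(source, candidates)
```

In a real system the candidates would come from the trained Seq2Seq generator (for example via beam search or sampling), and the encoder would be the trained evaluator rather than token counts.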
